Abstract

We consider the problem of finding a spanning tree with maximum number of leaves
(MaxLeaf). A 2-approximation algorithm is known for this problem, and a 3/2-approximation
algorithm when restricted to graphs where every vertex has degree 3 (cubic graphs).
MaxLeaf is known to be APX-hard in general, and NP-hard for cubic graphs. We show
that the problem is also APX-hard for cubic graphs. The APX-hardness of the related
problem Minimum Connected Dominating Set for cubic graphs follows.

1 Introduction

We study the problem Maximum Leaf Spanning Tree or MaxLeaf, for which the objective
is to find in a given connected graph a spanning tree with maximum number of leaves. An
$\alpha$-approximation algorithm for a maximization (minimization) problem is a polynomial time
algorithm that returns a solution with objective value at least (at most) $\alpha \cdot \mathrm{OPT}$, where OPT
is the objective value of an optimal solution for the given instance.¹ MaxLeaf is known to be
APX-hard [12], which implies that there exists an $\varepsilon > 0$ such that no polynomial time
$(1 - \varepsilon)$-approximation algorithm is possible for this problem, unless P=NP [2]. However, constant
factor approximation algorithms are known: Lu and Ravi [20] gave a 1/3-approximation, and
this was later improved by Solis-Oba who gave a linear time 1/2-approximation [23]. So the
problem is in APX (the class of optimization problems with constant factor approximation
algorithms), and therefore APX-complete.

MaxLeaf is closely related to Minimum Connected Dominating Set (MinCDS). This problem
asks, given a graph G, for a set $S \subseteq V(G)$ of minimum size such that $G[S]$ is connected
and every vertex $v \notin S$ is adjacent to a vertex in S (a connected dominating set). The
relation between these problems is as follows: since the non-leaves of a spanning tree of G form a
connected dominating set (unless $G = K_2$), G has a spanning tree with at least k leaves if and
only if G has a connected dominating set of size at most $|V(G)| - k$. These problems differ
from an approximability viewpoint: Guha and Khuller [14] showed that MinCDS admits no
constant factor approximation algorithm under established complexity-theoretic assumptions.
Ruan et al. [22] give a $(2 + \ln \Delta(G))$-approximation algorithm, where $\Delta(G)$ is the maximum
degree of G.
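The correspondence between leaves and connected dominating sets can be checked by brute force on a small instance. The following Python sketch is illustrative only: the helper names and the choice of the 3-dimensional cube as test graph are assumptions made here and do not come from the paper.

```python
from itertools import combinations

def is_connected(adj, nodes):
    """Return True if the subgraph induced by the non-empty set `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def min_cds_size(adj):
    """Size of a minimum connected dominating set, by exhaustive search."""
    for size in range(1, len(adj) + 1):
        for cand in combinations(adj, size):
            s = set(cand)
            dominated = all(v in s or s & adj[v] for v in adj)
            if dominated and is_connected(adj, s):
                return size
    return None

def max_leaf_count(adj):
    """Maximum number of leaves over all spanning trees, by exhaustive search."""
    vertices = list(adj)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    best = 0
    for tree in combinations(edges, len(vertices) - 1):
        tadj = {v: set() for v in vertices}
        for u, v in tree:
            tadj[u].add(v)
            tadj[v].add(u)
        # n - 1 edges plus connectivity means `tree` is a spanning tree
        if is_connected(tadj, vertices):
            leaves = sum(1 for v in vertices if len(tadj[v]) == 1)
            best = max(best, leaves)
    return best

# Hypothetical test graph: the cube (cubic, 8 vertices); neighbours differ in one bit.
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
```

On this instance, `max_leaf_count(cube)` and `len(cube) - min_cds_size(cube)` agree, as the relation above predicts. Both searches are exponential and only meant for very small graphs.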

In cubic graphs, every vertex has degree 3. The restriction of MaxLeaf to cubic graphs
has received much attention. One reason is that these are easier to analyze algorithmically,
yet from an approximation viewpoint, this is where the main hardness lies. For instance,
for 5-regular graphs a 2/3-approximation follows easily from known bounds [13], see below.
For cubic graphs, more work is required to obtain this ratio: Loryś and Zwoźniak [18] gave
a 4/7-approximation for MaxLeaf on cubic graphs. This ratio was later improved to 3/5 by
Correa et al. [6], and finally by Bonsma and Zickfeld [4] to 2/3. A natural question is how
far this can be improved. However, even the question whether the problem is APX-hard
for cubic graphs remained open. This question was asked in [6] and [4]. The only known
hardness result for cubic graphs is that the problem is NP-hard, as was shown by Lemke in
an unpublished technical report [17].

¹ In the literature on MaxLeaf, approximation algorithms are usually stated with approximation
ratios $\alpha > 1$. For our proofs it is more convenient to define these as $1/\alpha$-approximation algorithms.

In this paper we settle the question by showing that also for cubic graphs, the problem is
APX-hard. This is strictly stronger than the known hardness results [17, 12]. From this the
APX-hardness of MinCDS for cubic graphs will also follow. The proof is interesting by itself,
since it shows how APX-hardness results can be proved using extremal arguments. Informally
speaking, the problem with proving APX-hardness for cubic graphs is that it seems impossible
to find 'well-behaved' gadgets that allow for an easy analysis of the graph constructed in the
reduction. Instead we have a simple construction, but need an elaborate global analysis of
the constructed graph, involving various (fractional) bounds and rounding arguments. As a
contrast, we give a new, very simple and more traditional APX-hardness proof for MaxLeaf
in general graphs at the end of this introduction.

APX-hardness results for basic problems in restricted graph classes, in particular cu- 
bic graphs, are useful since they allow for simple hardness proofs of many other problems. 
The four hardness results by Alimonti and Kann [1] have often been used for this purpose:
they show that the problems Minimum Vertex Cover, Maximum Independent Set, Minimum 
Dominating Set and Maximum Cut are APX-hard when restricted to cubic graphs. Their 
APX-hardness results for Maximum Independent Set and Minimum Vertex Cover will be used 
for the two reductions in this paper. 

We now review some algorithmic results on MaxLeaf. Recently, the generalization of
MaxLeaf to directed graphs or digraphs has received a lot of attention. Very recently Daligault
and Thomassé [7] gave a constant factor approximation algorithm for this problem (more
precisely, a 1/92-approximation algorithm), improving on the $\Omega(1/\sqrt{\mathrm{OPT}})$-approximation
of Drescher and Vetta [9]. The paper of Daligault and Thomassé [7] also deals with the
parameterized variant of the decision version of Directed MaxLeaf. See [10, 16] for other
parameterized results on (un)directed MaxLeaf. Undirected MaxLeaf has also been studied
in the area of fast exact algorithms. Fomin et al. [11] gave an algorithm for finding a minimum
connected dominating set, and therefore a maximum leaf spanning tree, that runs in time
$O(1.9407^n)$ where n is the number of vertices.

Combinatorial bounds form an important ingredient for many of the above results. For
instance, it is known that connected graphs with minimum degree $\delta \ge 3$ on n vertices admit
a spanning tree with at least $n/4 + 2$ leaves [15]. A stronger version of this bound appears
in [5]. For cubic graphs, see [^ for an improved bound. When $\delta \ge 4$, $2n/5 + 8/5$ leaves are
possible [15, 13], and for $\delta \ge 5$, $n/2 + 2$ leaves are possible [13]. In [3] and [7] bounds for the
directed case can be found.
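The remark in the introduction that a 2/3-approximation for 5-regular graphs follows easily from these bounds can be made explicit. The following is only a sketch of one way to see it: it assumes the standard degree-counting upper bound on the number of leaves (not spelled out in the text) and that a tree meeting the bound of [13] can be found in polynomial time. Here L denotes the number of leaves of an arbitrary spanning tree T of a 5-regular graph on n vertices.

```latex
% Leaves have tree degree 1; the other n - L vertices have tree degree at most 5.
\[
2(n-1) \;=\; \sum_{v \in V(T)} d_T(v) \;\le\; L + 5(n - L)
\quad\Longrightarrow\quad
L \;\le\; \frac{3n+2}{4}.
\]
% Comparing with the lower bound n/2 + 2 for minimum degree at least 5:
\[
\frac{n/2 + 2}{(3n+2)/4} \;=\; \frac{2n+8}{3n+2} \;\ge\; \frac{2}{3},
\qquad \text{since } 3(2n+8) = 6n + 24 \;\ge\; 6n + 4 = 2(3n+2).
\]
```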

One may wonder why it is much harder to prove APX-hardness for cubic graphs than
it is to prove NP-hardness for cubic graphs [17] or APX-hardness for general graphs [12].
Indeed, for general graphs a very simple APX-hardness proof can be given, using a reduction
from the APX-hard problem Cubic Minimum Vertex Cover: let G be a cubic graph on n
vertices and $m = \frac{3}{2}n$ edges for which we search a minimum vertex cover, i.e. a minimum
set $S \subseteq V(G)$ such that every edge of G is incident with some vertex of S. Let k be the
size of a minimum vertex cover. Construct a MaxLeaf instance $G'$ as follows: introduce a
new vertex x, and add edges from x to every other vertex. Next, subdivide every edge not
incident with x with a single vertex. It can be checked that any spanning tree in $G'$ can
be transformed into a spanning tree with at least as many leaves, where all the degree 2
vertices are leaves. From this it follows that G has a vertex cover with at most y vertices
if and only if $G'$ has a spanning tree with at least $n - y + m$ leaves. Since G is cubic,
$k \ge m/3$. A $(1 - \varepsilon)$-approximation algorithm for MaxLeaf now yields a solution with at least
\[
(1 - \varepsilon)(n - k + m) = n - k + m - \varepsilon(\tfrac{5}{3}m - k) \ge n - k + m - \varepsilon(5k - k) = n - (1 + 4\varepsilon)k + m
\]
leaves, and therefore a vertex cover of size at most $(1 + 4\varepsilon)k$. This concludes the APX-hardness proof.
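The reduction above is small enough to test exhaustively. The sketch below builds $G'$ for a cubic graph and compares the maximum number of leaves with $n - k + m$. All function names are hypothetical, and the maximum number of leaves is computed through the relation with minimum connected dominating sets stated earlier, which is an implementation shortcut chosen here rather than anything prescribed by the paper.

```python
from itertools import combinations

def build_reduction(n, edges):
    """Build G': a new vertex x joined to all of V(G), every original edge subdivided."""
    adj = {v: set() for v in range(n)}
    adj["x"] = set(range(n))
    for v in range(n):
        adj[v].add("x")
    for u, v in edges:
        w = ("sub", u, v)  # subdivision vertex of the edge uv
        adj[w] = {u, v}
        adj[u].add(w)
        adj[v].add(w)
    return adj

def min_vertex_cover_size(n, edges):
    """Size k of a minimum vertex cover, by exhaustive search."""
    for size in range(n + 1):
        for cand in combinations(range(n), size):
            s = set(cand)
            if all(u in s or v in s for u, v in edges):
                return size

def is_cds(adj, s):
    """True if the non-empty set s is a connected dominating set."""
    if not all(v in s or s & adj[v] for v in adj):
        return False
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & s:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == s

def max_leaves(adj):
    """Max number of leaves = |V| - (min connected dominating set size); valid for |V| > 2."""
    for size in range(1, len(adj) + 1):
        for cand in combinations(adj, size):
            if is_cds(adj, set(cand)):
                return len(adj) - size

# Two small cubic test graphs (chosen here for illustration).
K4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
K33 = [(a, b) for a in range(3) for b in range(3, 6)]
```

For both test graphs the value returned by `max_leaves(build_reduction(n, edges))` equals `n - min_vertex_cover_size(n, edges) + len(edges)`.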

It seems however impossible to give a similar simple proof for cubic graphs. Considering 
the NP-hardness proof for cubic graphs, Lemke [17] gave a reduction from Exact Cover by
3-Sets. Here a 3-uniform hypergraph G on n vertices is given (i.e. all edges contain three
vertices). The question is whether there is a subset of the edges Q such that every vertex is
contained in exactly one edge of Q. For every instance G, in [17] a graph is constructed that
has a spanning tree without vertices of degree 2 if and only if G is a 'yes'-instance. It is easily 
seen that such a tree is optimal. However, an approximation preserving reduction from an 
APX-hard problem needs also to take into account cases where the tree is not optimal, that 
is, it contains some degree 2 vertices. In this case the behavior of the subgraphs in Lemke's 
construction, or even any cubic construction, becomes much harder to analyze. 

In Section 2 we give definitions and notations, and in Section 3 the construction of our
APX-hardness proof, which uses an approximation preserving reduction from Cubic Maximum
Independent Set. Sections 4 and 5 show how leafy spanning trees yield large independent sets
and vice versa, and in Section 6 these bounds are combined to conclude the proof.

2 Preliminaries 

For basic graph theoretic definitions, we follow [8]. By $d_G(v)$ we denote the degree of v in
graph G. The subscript is omitted when the graph in question is clear. By $\delta(G)$ and $\Delta(G)$
we denote the minimum and maximum degree of G, respectively. By $G - S$ we denote the
graph obtained from G by deleting the vertex or edge set S.

A directed graph or digraph D consists of a vertex set $V(D)$ and arc set $A(D)$, which is a
set of ordered 2-tuples of vertices. For an arc $(u, v) \in A(D)$, u is called the tail and v the head
of $(u, v)$. The in-degree $d^-(v)$ (out-degree $d^+(v)$) of a vertex v is the number of arcs of which
v is the head (tail). A directed graph (digraph) D is an orientation of an undirected graph G
if $V(D) = V(G)$ and there exists a bijection $f : A(D) \to E(G)$ with $f((u, v)) = \{u, v\}$ for all
$(u, v) \in A(D)$. An out-tree orientation of a tree T is an orientation $T'$ of the (given undirected)
tree T such that $T'$ is an out-tree, that is, there is exactly one vertex with in-degree 0, which
is called the root. Note that every other vertex then has in-degree 1.

A vertex sequence $v_0, \ldots, v_k$ is called a path or cycle in a digraph D if it is a path or cycle
in the underlying undirected graph (i.e. $(v_i, v_{i+1}) \in A(D)$ or $(v_{i+1}, v_i) \in A(D)$ holds for all
i). Directed paths and cycles, where $(v_i, v_{i+1}) \in A(D)$ holds for all i, are called dipaths and
dicycles. A path from u to v is also called a $(u, v)$-path. In an undirected graph G, v is said
to be reachable from u if a $(u, v)$-path exists in G. In a digraph D, v is reachable from u if a
$(u, v)$-dipath exists.

An induced subgraph H of an undirected graph G is called a k-terminal subgraph if H
contains exactly k vertices that have neighbors outside of H; these are called its terminals.



3 The Construction of a Weighted MaxLeaf Instance 

We now prove that Cubic MaxLeaf is APX-hard (and thus APX-complete), using a reduction
from Cubic Maximum Independent Set (Cubic MIS). This problem has as input a cubic graph
G, and asks for a maximum size set $S \subseteq V(G)$ such that no two vertices in S are adjacent.
To improve the presentation, we will prove that the following problem variant is APX-hard,
from which APX-hardness of Cubic MaxLeaf easily follows. The problem Weighted MaxLeaf
has as input a graph G with $\Delta(G) \le 3$ and $\delta(G) \ge 2$, and the objective is to find a spanning
tree T that maximizes the number of vertices v with $d_T(v) = 1$ and $d_G(v) = 3$. We will also
call vertices of G with degree 3 weighted vertices and the other vertices unweighted. So the
objective is to maximize the number of weighted leaves. By $\ell(T)$ we will denote the number
of weighted leaves of T.
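As a small illustration of this objective, the following Python sketch counts weighted leaves for a given spanning tree. The function name and the example instance (K4 with one edge removed) are assumptions made here for illustration.

```python
def weighted_leaves(graph_edges, tree_edges):
    """Count the vertices that have degree 3 in G and degree 1 in the spanning tree T."""
    deg_g, deg_t = {}, {}
    for u, v in graph_edges:
        deg_g[u] = deg_g.get(u, 0) + 1
        deg_g[v] = deg_g.get(v, 0) + 1
    for u, v in tree_edges:
        deg_t[u] = deg_t.get(u, 0) + 1
        deg_t[v] = deg_t.get(v, 0) + 1
    return sum(1 for v in deg_g if deg_g[v] == 3 and deg_t.get(v) == 1)

# Hypothetical instance: K4 minus the edge {0, 1}; vertices 0 and 1 have degree 2.
G_EDGES = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
STAR_TREE = [(2, 0), (2, 1), (2, 3)]  # leaves 0, 1, 3; only vertex 3 is weighted
PATH_TREE = [(0, 2), (2, 3), (3, 1)]  # leaves 0, 1; neither is weighted
```

The star has three leaves but only one weighted leaf, while the path has two leaves and no weighted leaf, which shows how the weighted objective differs from the plain leaf count.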

From instances G of Weighted MaxLeaf, it is easy to construct equivalent Cubic MaxLeaf
instances: replace every vertex of degree 2 by the 1-terminal subgraph as shown in Figure 1(a)
(the two half edges indicate the terminal). The next lemma is easily observed.

Lemma 1 Let $G'$ be the cubic graph obtained from a graph G with $\delta(G) = 2$, $\Delta(G) = 3$ by
replacing all x vertices of degree 2 as shown in Figure 1(a). Then $G'$ has a spanning tree with
at least $\ell + [\ldots]$ leaves if and only if G has a spanning tree with at least $\ell$ weighted leaves.

The construction of Weighted MaxLeaf instances uses the following gadgets. A vertex
gadget of G is an induced 4-terminal subgraph of G as shown in Figure 1(b), where the four
terminals are indicated by half edges. Note that one vertex has degree 2, and therefore does
not count towards the number of weighted leaves.






Construction Let G be a Cubic MIS instance on n vertices. We use this to construct
in polynomial time a Weighted MaxLeaf instance as follows. First, we assume w.l.o.g. that
$G \ne K_4$, and thus we can construct a proper 3-coloring of G, using colors red, green and blue.
(By Brooks' Theorem such a coloring exists, and in addition it can be found in polynomial
time, see also [19].) Let r and g be the number of red and green vertices respectively, and
w.l.o.g. assume $r \ge 1$ and $g \ge 1$. Number the vertices of G as $v_0, \ldots, v_{n-1}$ such that $v_0, \ldots, v_{r-1}$
are red, $v_r, \ldots, v_{r+g-1}$ are green, and $v_{r+g}, \ldots, v_{n-1}$ are blue. We construct a graph Q as
follows. The construction is illustrated in Figure 2.

1. Start with G. Add a cycle consisting of n connection vertices $c_0, \ldots, c_{n-1}$ and edges
$c_i c_{(i+1) \bmod n}$ for $i \in \{0, \ldots, n - 1\}$.

2. Add edges $v_i c_i$ for all $i \in \{0, \ldots, n - 1\}$.

3. Subdivide every edge with one new vertex (of degree 2).

4. Replace every vertex $v_i$ of degree four with a vertex gadget $H_i$, such that every terminal
of $H_i$ becomes adjacent to a different neighbor of $v_i$. (Choose arbitrarily which terminals
become adjacent to which neighbors.)
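Steps 1 to 3 of the construction can be sketched in a few lines of Python. Step 4 is left out on purpose, because the internal structure of the vertex gadget is only given in Figure 1(b). The vertex encoding with tagged tuples, the function names and the prism test graph are assumptions made for this sketch.

```python
def build_g_prime(n, edges):
    """Steps 1 and 2: add the cycle c_0, ..., c_{n-1} and the edges v_i c_i."""
    gp = [(("v", u), ("v", w)) for u, w in edges]
    gp += [(("c", i), ("c", (i + 1) % n)) for i in range(n)]
    gp += [(("v", i), ("c", i)) for i in range(n)]
    return gp

def subdivide(edge_list):
    """Step 3: replace every edge ab by a-s and s-b with a new degree-2 vertex s."""
    result = []
    for idx, (a, b) in enumerate(edge_list):
        s = ("s", idx)
        result += [(a, s), (s, b)]
    return result

def degrees(edge_list):
    """Degree of every vertex that occurs in the edge list."""
    deg = {}
    for a, b in edge_list:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

# Hypothetical input: the triangular prism, a cubic graph on n = 6 vertices.
PRISM = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 3), (1, 4), (2, 5)]
g_prime = build_g_prime(6, PRISM)
subdivided = subdivide(g_prime)
```

After Step 3 every $v_i$ has degree four, every connection vertex has degree three, and there is one subdivision vertex of degree 2 for each of the $3.5n$ edges of $G'$; the degree-four vertices are exactly the ones Step 4 replaces by gadgets.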

Let Q be the resulting graph, and let $G'$ be the graph obtained after Step 2 in this construction.
Recall that by our definition of Weighted MaxLeaf, the vertices introduced in Step 3 do not
count towards the number of weighted leaves. For the proofs below it will be useful to denote
how end vertices of edges of $G'$ correspond to vertices of Q. In Step 3, edges uv of $G'$ are
subdivided with a new vertex w to yield two edges uw and vw. In Step 4, the edge uw may
be replaced by an edge tw, where t is a terminal of a vertex gadget. If this is the case, $t_{uv}(u)$
will denote this terminal t, otherwise $t_{uv}(u)$ will denote u.

We will proceed to show that for every $x \in \mathbb{R}$, if Q has a spanning tree with at least
$3.75n + 1.5x$ weighted leaves, then G has an independent set of size at least $x - \frac{1}{2}$ (Section 4),
which can be constructed in polynomial time. In addition, if G has an independent set of size
x, Q has a spanning tree with at least $\lfloor 3.75n + 1.5x \rfloor$ weighted leaves (Section 5). In Section 6
it is then shown that this yields a $(1 - 141\varepsilon)$-approximation algorithm for Cubic MIS, when a
$(1 - \varepsilon)$-approximation algorithm for Cubic MaxLeaf is given. This proves APX-hardness for
Cubic MaxLeaf.



4 Constructing an Independent Set from a Spanning Tree 

We first take a closer look at the behavior of vertex gadgets, by bounding the number of 
weighted leaves a spanning tree may contain within one given vertex gadget. 

Proposition 2 Let Q be a Weighted MaxLeaf instance, T be a spanning tree of Q and H be a
vertex gadget of Q. Let $T'$ be an out-tree orientation of T with root $r^* \in V(Q) \setminus V(H)$. Then
the following bounds hold:

(i) H contains at most six weighted leaves of T.

(ii) If $T'$ contains at least one arc leaving $V(H)$, then H contains at most four weighted
leaves of T.

(iii) If $T'$ contains at least two arcs leaving $V(H)$, then H contains at most three weighted
leaves of T.

Figure 3: A spanning tree with $24 = \lceil 3.75 \cdot 6 + 1.5 \rceil$ weighted leaves yields a size 1 independent
set.

Proof: In the proof we will refer to the vertex labels of H as shown in Figure 1(b).

(i) $\{a, d, f\}$ and $\{b, g, i\}$ are vertex cuts of Q, so both contain at least one non-leaf of a
spanning tree. They are disjoint, so H contains at least two weighted non-leaves of T.

(ii) Since every arc of $T'$ that leaves $V(H)$ is part of a dipath in $T'$ that starts at the
root, T contains a path P in H from one terminal of H to another, where all vertices of
P are non-leaves. Suppose b is one of the ends of P. Then either P contains at least four
weighted vertices, or P contains the vertices b, c, f and i. In the second case the vertex
cut $\{a, e, g\}$ shows there is at least one more non-leaf, so in both cases we have found four
weighted non-leaves. Now suppose g is one of the ends of P. If h is the other end this ensures
that g and h are non-leaves, and the two disjoint vertex cuts $\{a, d, f\}$ and $\{b, e, i\}$ show there
are at least two more weighted non-leaves. If i is the other end, P either has length at least
four (in which case we are done), or it contains g, h and i. Then the vertex cut $\{a, d, f\}$
shows there must be at least one more weighted non-leaf. Finally, if P goes from h to i, the
two vertex cuts $\{b, f\}$ and $\{a, e, g\}$ show that there are at least four weighted non-leaves.

(iii) Because there are at least two arcs leaving $V(H)$, in this case $T - L(T)$ contains a
subgraph of H of one of the following two forms: it contains a tree $T_H$ that contains at least
three terminals of H, or it contains two paths between disjoint terminal pairs of H. (Note
that all vertices of these subgraphs are non-leaves.) In the latter case five weighted non-leaves
are easily found by considering shortest path lengths. Similarly, five non-leaves are also easily
found when $\{b, g, h\} \subseteq V(T_H)$ or $\{b, g, i\} \subseteq V(T_H)$. If $\{b, h, i\} \subseteq V(T_H)$, four weighted leaves
are only possible when a, d, e and g are leaves, but this is not possible since $\{a, e, g\}$ is a
vertex cut. Finally, when $\{g, h, i\} \subseteq V(T_H)$, the three vertex cuts $\{b, f\}$, $\{b, d, e\}$ and $\{a, d, f\}$
show there are at least two additional weighted non-leaves. □

In the remainder of this section, we will prove the next lemma, which shows that an
independent set I of G of sufficient size can be constructed when a spanning tree T of Q is
given. The construction is illustrated in Figure 3. The constructed independent set consists
of the single encircled vertex. Numbers indicate numbers of weighted leaves. The choice of
the orientations is explained below.

The intuitive idea behind the next proof is as follows. Not too many vertex gadgets in
Q can contain six weighted leaves of a spanning tree T, since edges in vertex gadgets are
needed to connect T. In particular, such vertex gadgets cannot be adjacent and thus form
our independent set. With a similar more delicate argument we will also show that not all
vertex gadgets can contain four leaves of T. How much every vertex gadget contributes to
'connecting T' is encoded by the out-degrees of vertices of $G'$ in the proof below. The proof
of the lemma consists of a number of claims.

Lemma 3 Let Q be constructed from a cubic graph G on n vertices as shown in Section 3.
If Q has a spanning tree T with $\ell(T) \ge 3.75n + 1.5x$, then an independent set I of G with
$|I| \ge x - \frac{1}{2}$ can be constructed in polynomial time.

Let T be a spanning tree of Q with $\ell(T) \ge 3.75n + 1.5x$. To construct an independent set I of
G with the desired size, we will first use T to orient $G'$ and G. Observe that there is some
connection vertex of Q that is not a leaf of T. Choose $r^*$ to be such a vertex. Orient T as an
out-tree with root $r^*$. An orientation of $G'$ can be obtained from the out-tree T as follows:
consider an edge $uv \in E(G')$, which was subdivided with a new vertex w for constructing Q.
So uv corresponds to edges $t_1 w$ and $t_2 w$ of Q, with $t_1 = t_{uv}(u)$ and $t_2 = t_{uv}(v)$. The edge uv is now
oriented as follows: if $(t_1, w) \in A(T)$, then choose the orientation $(u, v)$. If $(t_2, w) \in A(T)$,
then choose the orientation $(v, u)$. Observe that this uniquely determines the direction of uv
in every case. Doing this for all edges of $G'$ yields the orientation of $G'$. Since G is a subgraph
of $G'$, this also yields the orientation of G that we will use.

The set I now consists of all vertices of G that have out-degree 0. Clearly this is an
independent set, and I can be constructed in polynomial time. Let $n_i$ denote the number of
vertices of G with out-degree i, so $|I| = n_0$. Let $n'_i$ be the number of vertices of G that have
out-degree i in $G'$. Observe that since $r^*$ is not part of a vertex gadget, $n'_4 = 0$. Note that $v_i$
has out-degree d in $G'$ if and only if T contains d arcs leaving $H_i$. So Proposition 2 shows that
if $v_i$ has out-degree 3 in $G'$, then T has at most three weighted leaves in the vertex gadget
$H_i$, etc. This yields:

Claim 1 The number of non-connection vertices of Q that are weighted leaves of T is bounded
by $6n'_0 + 4n'_1 + 3n'_2 + 3n'_3$.

Since T is an out-tree, every vertex of T is reachable from the root $r^*$. Therefore every
vertex of $G'$ is reachable from $r^*$ in the chosen orientation (possibly by multiple dipaths).
Observe that every connection vertex that is a leaf in T has out-degree 0 in $G'$. Let z be the
number of connection vertices of $G'$ that have an in-neighbor that is not a connection vertex.

Claim 2 At most $\lceil z/2 \rceil$ connection vertices of Q are leaves in T.

Proof: Let $c_{\sigma_1}, \ldots, c_{\sigma_k}$ be the connection vertices of Q that are leaves in T, with $\sigma_i < \sigma_{i+1}$
for all i. All of these vertices have in-degree 3 in $G'$, which accounts for k connection vertices
that have an in-neighbor that is not a connection vertex.

Consider $c_{\sigma_i}$ and $c_{\sigma_{i+1}}$, for some i. Since these vertices have in-degree 3, they are not
adjacent in $G'$. Therefore there is at least one connection vertex $c_l$ that lies between them on
the cycle of connection vertices (that is, $\sigma_i < l < \sigma_{i+1}$). $G'$ contains a dipath P from $r^*$ to $c_l$,
which clearly cannot contain $c_{\sigma_i}$ or $c_{\sigma_{i+1}}$ as internal vertices. So unless $r^*$ also lies between
$c_{\sigma_i}$ and $c_{\sigma_{i+1}}$, P must contain a connection vertex between $c_{\sigma_i}$ and $c_{\sigma_{i+1}}$ that has an
in-neighbor that is not a connection vertex.

Since the above argument can be applied for k different pairs of connection vertices and
$r^*$ lies only between one such pair, this accounts for $k - 1$ additional such vertices. It follows
that $z \ge 2k - 1$. □



A second way to interpret the parameter z is the following: there are exactly z vertices
with different out-degrees in G and $G'$. In this case the out-degree in $G'$ is one higher. This
observation yields the following equality.

Claim 3 $z + 3n'_0 + 2n'_1 + n'_2 = 3n_0 + 2n_1 + n_2$.

Proof: Let $k_i$ denote the number of vertices with out-degree i in G and out-degree $i + 1$ in $G'$.
From $n'_4 = 0$, $k_3 = 0$ follows. Vertices for which the out-degree increases this way correspond
to in-neighbors of connection vertices in $G'$, so $z = k_0 + k_1 + k_2$. In addition we have that
$n'_i = n_i - k_i + k_{i-1}$. Substituting these expressions yields the stated equality. □
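The substitution in the proof of Claim 3 can be written out as follows. This is only a worked restatement of the step above; the convention $k_{-1} = 0$ is an assumption made explicit here, so that $n'_0 = n_0 - k_0$.

```latex
\begin{align*}
3n'_0 + 2n'_1 + n'_2
  &= 3(n_0 - k_0) + 2(n_1 - k_1 + k_0) + (n_2 - k_2 + k_1) \\
  &= 3n_0 + 2n_1 + n_2 - (k_0 + k_1 + k_2) \\
  &= 3n_0 + 2n_1 + n_2 - z .
\end{align*}
```

Adding z to both sides gives the equality of Claim 3.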

With the above observations, we can bound the number of weighted leaves of T. Let
$m = 1.5n$ be the number of arcs of G. By counting in-degrees we have $m = 3n_0 + 2n_1 + n_2$.

\begin{align*}
\ell(T) &\le 6n'_0 + 4n'_1 + 3n'_2 + 3n'_3 + \lceil z/2 \rceil \\
&\le \lceil 3n + 1.5|I| + 1.5n'_0 + n'_1 + z/2 \rceil \\
&\le \lceil 3n + 1.5|I| + 1.5n_0 + n_1 + 0.5n_2 \rceil \\
&\le \lceil 3n + 1.5|I| + 0.5m \rceil = \lceil 3.75n + 1.5|I| \rceil .
\end{align*}

Here we used Claim 1, Claim 2, $n = n'_0 + n'_1 + n'_2 + n'_3$; $|I| = n_0 \ge n'_0$; $z/2 + 1.5n'_0 + n'_1 + 0.5n'_2 =
1.5n_0 + n_1 + 0.5n_2$ (Claim 3); $m = 3n_0 + 2n_1 + n_2$ and $m = 1.5n$, respectively.

So if $\ell(T) \ge 3.75n + 1.5x$, then $\lceil 3.75n + 1.5|I| \rceil \ge \ell(T) \ge 3.75n + 1.5x$. Since G is a
cubic graph, n is even. It follows that $3.75n + 1.5|I|$ is half integral, so $3.75n + 1.5|I| + 0.5 \ge
\lceil 3.75n + 1.5|I| \rceil \ge 3.75n + 1.5x$, and thus $|I| \ge x - \frac{1}{2}$. This concludes the proof of Lemma 3.

5 Constructing a Spanning Tree from an Independent Set 

In this section we will prove the following lemma, which shows that a spanning tree T with
enough weighted leaves can be constructed when an independent set I of G is given. The
proof consists of a number of claims.

The intuitive idea behind the proof is as follows. When given an independent set I of
G, we can construct a spanning tree T of Q that does not use any vertex gadget $H_i$ with
$v_i \in I$ for 'connecting T'. For arguing that we can still make T connected, we need to use
the 3-coloring of G. We fix a connection vertex as root, and show that the red vertices can
be reached from this root. This is needed to show that green vertices can be reached, which
is in turn needed to show that blue vertices can be reached.

Lemma 4 Let Q be constructed from a cubic graph G on n vertices as shown in Section 3. If G
has an independent set I with $|I| \ge x$, then Q has a spanning tree T with $\ell(T) \ge \lfloor 3.75n + 1.5x \rfloor$.

Figure 4: A size 2 independent set yields a spanning connected subgraph with
$\lfloor 3.75 \cdot 6 + 1.5 \cdot 2 \rfloor = 25$ weighted leaves.

Throughout the proof we will refer to the vertex coloring of G that was used for the
construction of Q. Let I be a maximal independent set of G with $|I| \ge x$. We use this to construct
a spanning tree with at least $\lfloor 3.75n + 1.5x \rfloor$ leaves as follows. The construction is illustrated
in Figure 4, where I is represented by encircled vertices in G. First, for all $v \in I$, orient all
incident edges xv of G as $(x, v)$, so every $v \in I$ has out-degree 0. This is possible since I
is an independent set. For all edges that are not incident with a vertex from I, choose the
direction from red to green, from green to blue or from red to blue, whichever applies. This
yields the orientation of G. We extend this to an orientation of $G'$ as follows:

• If $v_i$ has out-degree 0, 1 or 3 in G, we orient $c_i v_i$ towards $v_i$.

• If $v_i$ has out-degree 2 in G, we orient $c_i v_i$ towards $c_i$.

• Let C be the set of connection vertices $c_i$ in $G'$ that now have an incoming arc $(v_i, c_i)$.
Let $g_C$ be the number of connection vertices $c_i \in C$ where $v_i$ is green. For every
$0 \le i \le n - 1$, the edge $c_i c_{(i+1) \bmod n}$ is directed towards $c_{(i+1) \bmod n}$
if $|C \cap \{c_0, \ldots, c_i\}| \bmod 2 = g_C \bmod 2$, and towards $c_i$ otherwise.

In Figure 4, $C = \{c_2, c_4, c_5\}$. C is represented by encircled vertices of $G'$. Of these vertices,
only $c_2$ has a green in-neighbor, so $g_C = 1$. Therefore $c_0 c_1$ is oriented towards $c_0$, etc.
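The orientation rules can be tried out on a small instance. The Python sketch below follows the rules as reconstructed above, writing `g_c` for the number of green vertices with $c_i \in C$, and uses the root choice made later in the proof ($c_0$ if that number is even, $c_{r-1}$ if it is odd). The function names, the tagged-tuple vertex encoding and the prism example with its coloring and independent set are all assumptions made here for illustration.

```python
from collections import deque

RANK = {"red": 0, "green": 1, "blue": 2}

def orient_g_prime(n, edges, color, ind_set):
    """Return (arcs, root) for the orientation of G' described in the text.

    Assumes vertices 0..n-1 are numbered red first, then green, then blue,
    that `color` is a proper 3-coloring and that `ind_set` is independent.
    """
    arcs = []
    for u, v in edges:
        if v in ind_set:
            arcs.append((("v", u), ("v", v)))
        elif u in ind_set:
            arcs.append((("v", v), ("v", u)))
        elif RANK[color[u]] < RANK[color[v]]:
            arcs.append((("v", u), ("v", v)))
        else:
            arcs.append((("v", v), ("v", u)))
    out_g = [0] * n  # out-degrees in the orientation of G
    for tail, _head in arcs:
        out_g[tail[1]] += 1
    in_c = [out_g[i] == 2 for i in range(n)]  # c_i is in C iff v_i has out-degree 2 in G
    for i in range(n):
        arcs.append((("v", i), ("c", i)) if in_c[i] else (("c", i), ("v", i)))
    g_c = sum(1 for i in range(n) if in_c[i] and color[i] == "green")
    count = 0  # number of vertices of C among c_0, ..., c_i
    for i in range(n):
        count += in_c[i]
        a, b = ("c", i), ("c", (i + 1) % n)
        arcs.append((a, b) if count % 2 == g_c % 2 else (b, a))
    r = color.count("red")
    root = ("c", 0) if g_c % 2 == 0 else ("c", r - 1)
    return arcs, root

def reachable(arcs, root):
    """Set of vertices reachable from root along arcs (breadth-first search)."""
    succ = {}
    for a, b in arcs:
        succ.setdefault(a, []).append(b)
    seen, queue = {root}, deque([root])
    while queue:
        a = queue.popleft()
        for b in succ.get(a, []):
            if b not in seen:
                seen.add(b)
                queue.append(b)
    return seen

# Hypothetical example: the triangular prism, numbered red, red, green, green, blue, blue.
EDGES = [(0, 2), (2, 4), (0, 4), (3, 5), (1, 5), (1, 3), (0, 3), (2, 5), (1, 4)]
COLOR = ["red", "red", "green", "green", "blue", "blue"]
arcs, root = orient_g_prime(6, EDGES, COLOR, {0, 5})
```

In this example only $v_3$ has out-degree 2 in G, so $C = \{c_3\}$ and the root is $c_1$; the out-degrees of the $v_i$ in $G'$ all lie in {0, 1, 3}, and every vertex of $G'$ is reachable from the root.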

We start with two simple observations on these orientations of $G'$. If a vertex $v_i$ has
out-degree 1 in G, it retains out-degree 1 in $G'$, and if it has out-degree 2 in G it receives
out-degree 3 in $G'$. If it has out-degree 3 in G it retains out-degree 3 in $G'$. This yields:

Claim 4 Vertices $v_i$ have out-degree 0, 1 or 3 in $G'$.

For red vertices $v_i$, either $d^+_G(v_i) = 0$ (if $v_i \in I$), or $d^+_G(v_i) = 3$ (if $v_i \notin I$), so in either case
$(c_i, v_i) \in A(G')$. Summarizing:

Claim 5 If $v_i$ is red, then $c_i \notin C$.

Let $n_d$ denote the number of vertices $v_k$ with $d^+_G(v_k) = d$.

Claim 6 $G'$ contains at least $\lfloor n_2/2 \rfloor$ vertices $c_i$ with $d^+(c_i) = 0$.

Proof: Observe that vertices $c_i \in C$ with $i \ge 1$ have in-degree 1 or in-degree 3 in $G'$, because
of the parity based orientation of edges between connection vertices. Recall that there is at
least one red vertex, so $v_0$ is red and $c_0 \notin C$ (Claim 5). Therefore all vertices in C have
in-degree 1 or 3, in alternating order of increasing index. Since $|C| = n_2$, it follows that there
are at least $\lfloor n_2/2 \rfloor$ connection vertices with in-degree 3 (and out-degree 0). □

Let $r^* = c_0$ if $g_C$ is even, and $r^* = c_{r-1}$ if it is odd. In Figure 4, $g_C = 1$ so $r^* = c_{r-1} = c_1$.

Claim 7 In the chosen orientation of $G'$, every vertex is reachable from $r^*$.

Proof: Out-degrees will refer to G in this proof. First we will show that every vertex $v_i$ of
$G'$ is reachable from some connection vertex. If $d^+_G(v_i) \ne 2$, then $v_i$ has a connection vertex
as in-neighbor, so the statement is clear. If $d^+_G(v_i) = 2$, then $v_i$ has an in-neighbor $v_x$ in $G'$,
with $v_x \notin I$, that must be red or green. If $v_x$ has a connection vertex as in-neighbor, we have
proved the statement. Otherwise, $v_x$ has an in-neighbor $v_y$ again, which then must be red. So
$v_y$ must have a connection vertex as in-neighbor. In any case, we have found a dipath from
some connection vertex to $v_i$.

A connection vertex $c_i$ will be called red, green or blue when its unique (in- or out-)
neighbor $v_i$ is red, green or blue respectively. We will now prove that all connection vertices
$c_i$ are reachable from $r^*$ in $G'$.

CASE 1: $c_i$ is red.

Since there are no red vertices $c_i \in C$ (Claim 5), $c_0, c_1, \ldots, c_{r-1}$ is a dipath in $G'$ if $g_C$ is
even, and $c_{r-1}, c_{r-2}, \ldots, c_0$ is a dipath if $g_C$ is odd. So we have chosen $r^*$ such that all
red connection vertices are reachable from $r^*$.

CASE 2: $c_i \in C$ is green.

Let $v_i$ be the (green) in-neighbor of $c_i$. The argument we have used above shows that
$v_i$ is reachable from some red connection vertex, which in turn is reachable from $r^*$ as
shown in case 1.

CASE 3: $c_i \notin C$ is green.

$c_i$ has a connection vertex as in-neighbor (either $c_{i-1}$ or $c_{i+1}$). If $c_{i-1}$ is its in-neighbor,
then $G'$ either contains a dipath $c_{r-1}, \ldots, c_i$, or a dipath $c_j, c_{j+1}, \ldots, c_i$ with $j < i$ and
$c_j \in C$. Both of these dipaths start at a reachable vertex (by case 1 and 2) so $c_i$ is
reachable from $r^*$. If $c_{i+1}$ is the in-neighbor of $c_i$, then the number of C vertices in
$\{c_0, \ldots, c_i\}$ has different parity than the number of green vertices in C. Since all C
vertices in $\{c_0, \ldots, c_i\}$ are green (Claim 5), this implies that there is at least one more
green vertex in C. So there exists a dipath $c_j, c_{j-1}, \ldots, c_i$ with $j > i$, $c_j$ green, and
$c_j \in C$. $c_j$ is reachable from $r^*$ by case 2, so $c_i$ is reachable as well.

CASE 4: $c_i \in C$ is blue.

By the same argument as earlier, the blue in-neighbor $v_i$ of $c_i$ is reachable from a red
or green connection vertex, which is reachable from $r^*$ by case 1, 2 or 3.

CASE 5: $c_i \notin C$ is blue.

Similar to the reasoning in case 3, we may trace a path back from $c_i$ consisting of
connection vertices, until we find a dipath starting at a vertex $c_j$, where $c_j$ is either red
or part of C. (This path may also be $c_0, c_{n-1}, c_{n-2}, \ldots, c_i$, so $j = 0$.) Case 1, 2 and 4
show that $c_j$ and thus $c_i$ is reachable from $r^*$.

Now we have considered all cases for connection vertices. It follows that all vertices of $G'$ are
reachable from $r^*$. □

Figure 5: Using out-degrees to construct a spanning tree.



Whenever we refer to the out-degree or in-degree of vertices below, this refers to $G'$, not
to G, unless explicitly noted otherwise. We use the orientation of $G'$ to construct a spanning
tree $T'$ of Q as follows. First we construct a spanning connected subgraph T:

1. For every vertex gadget in Q, Figure 5 shows which subset of the edges should be chosen
in T, depending on the out-degree and out-neighbor set of the corresponding vertex $v_i$
in $G'$. (Note that only out-degrees 0, 1 and 3 have to be considered by Claim 4.)

2. Every edge of Q that is not part of a vertex gadget is added to T.

3. For every vertex $c_j$ that has in-degree 3 in $G'$, delete the two incident T-edges that do
not correspond to the arc $(v_j, c_j)$ of $G'$, making $c_j$ a leaf of T.

4. Delete edges of T until no cycles remain, to obtain graph $T'$.

T denotes the graph as it is after Step 3 above. The following claim already shows for many
vertices of Q that they are reachable from $r^*$ in T.

Claim 8 If $G'$ contains a dipath $P' = r^*, \ldots, x, y$ with $d^+(y) \ge 1$, then T contains a path
from $r^*$ to $t_{xy}(y)$.

Proof: First, for every arc $(u, v)$ of P' we add the corresponding length 2 path in Q to P. To be 
precise, this is the path $t_{uv}(u), x, t_{uv}(v)$, where $x$ is the vertex resulting from the subdivision 
of $uv$ during the construction of Q. Observe that both of these path edges are also part of 
T: in Step 3 of the construction of T some edges that are not part of vertex gadgets are 
removed from T, but only those that are incident with a vertex $c_j$ with in-degree 3, and thus 
out-degree 0. Clearly such vertices cannot be internal vertices of P', and by our assumption, 
the end vertex $y$ of P' also has out-degree at least 1. At this point P may not be a path 
yet; it can consist of a sequence of paths where one path ends at a terminal $t_1$ of a vertex 
gadget $H_i$, and the next path starts at another terminal $t_2$ of $H_i$. Joining such paths together 
is easy when $d^+(v_i) = 3$: Figure 5(b) shows the edges of T that are part of $H_i$; observe 
that for every terminal pair $t_1$ and $t_2$ a path from $t_1$ to $t_2$ exists in T through $H_i$. So it 
suffices to prove that P' contains no internal vertices $v_j$ with $d^+(v_j) \neq 3$. Clearly all internal 
vertices have out-degree at least 1. No vertices $v_j$ of G' have out-degree 2 (by the earlier claim), so we 
only have to consider the case that $d^+(v_j) = 1$. Now we will use that we started with a 
maximal independent set I: because I is maximal, every vertex that is not in I has at least 
one neighbor in I. So by choice of the orientation of G, if $v_j$ has out-degree 1, its out-neighbor 
$v_k$ is in I, and $d^+(v_k) = 0$. The dipath P' cannot contain $v_k$ as internal vertex, and by choice 
of P', also not as end vertex $y$. Hence P' contains no internal vertices $v_j$ with out-degree 1. This 
concludes the proof. □ 

Using the previous two claims, T can be shown to be connected: 

Claim 9 All vertices $u \in V(Q)$ are reachable from r* within T. 

Proof: We consider four cases for u. 

CASE 1: $u$ is part of a vertex gadget $H_i$, with $d^+(v_i) \geq 1$. 

Figure 5(b) and (c) show that in every case, there is an arc $(w, v_i) \in A(G')$ such that T 
contains a path from $t = t_{wv_i}(v_i)$ to $u$. So we only need to show that $t$ can be reached 
from r* within T. By Claim 7, G' contains an $(r^*, w)$-dipath, which then yields a dipath 
$P' = r^*, \ldots, w, v_i$. From Claim 8 it now follows that T contains an $(r^*, t)$-path, and hence an $(r^*, u)$-path. 

CASE 2: $u$ is part of a vertex gadget $H_i$ with $d^+(v_i) = 0$. 

We again consider an arc $(w, v_i) \in A(G')$ such that T contains a path from $t = t_{wv_i}(v_i)$ 
to $u$ (such an arc exists, see Figure 5(a)). Let $t' = t_{wv_i}(w)$. If $t'$ is part of a vertex 
gadget, in Case 1 we showed that $t'$ can be reached from r* in T, which shows $u$ can 
be reached. Otherwise, $t' = c_j$ with $d^+(c_j) \geq 1$. Claim 7 shows that G' contains an 
$(r^*, c_j)$-dipath, which yields an $(r^*, t')$-path in T (Claim 8) and thus an $(r^*, u)$-path. 

CASE 3: $u = c_j$. 

If $d^+(c_j) \geq 1$, Claim 7 shows that G' contains an $(r^*, c_j)$-dipath, which yields an $(r^*, c_j)$-path 
in T by Claim 8. If $d^+(c_j) = 0$, then the construction of T shows that both edges 
of Q corresponding to the arc $(v_i, c_j)$ of G' are part of T. By Case 1, every vertex of the 
vertex gadget $H_i$ corresponding to $v_i$ is reachable from r* in T, so $c_j$ is reachable. 

CASE 4: $d(u) = 2$ and $u$ is not part of a vertex gadget. 

Here $u$ is the vertex resulting from the subdivision of an edge $xy$. Let $(x, y)$ be the 
orientation of this edge in G'. If $x = c_k$ for some $k$, then Case 3 shows that an $(r^*, c_k)$-path 
exists in T. This can be extended to the desired path; $c_k u \in E(T)$ since $d^+(c_k) \geq 1$. 
Otherwise, $x \in V(H_i)$, where $d^+(v_i) \geq 1$. Then Case 1 or 2 shows that an $(r^*, x)$-path 
exists in T, which can be extended again. □ 

Since Claim 9 shows that T is connected, clearly T' is connected as well. Since in addition 
T' contains no cycles, T' is a spanning tree of Q. It remains to prove that it has the desired 
number of leaves. Figure 5 shows that a vertex $v_i$ contributes six leaves to T if $d^+_{G'}(v_i) = 0$, 
four leaves if $d^+_{G'}(v_i) = 1$ and three leaves if $d^+_{G'}(v_i) = 3$. In addition, every vertex $c_j$ with 
in-degree 3 in G' is a leaf of T by Step 3 of the construction of T. Claim 6 shows that there 
are at least $\lfloor n_2/2 \rfloor$ such vertices. Recall that $n_d$ denotes the number of vertices that have 
out-degree $d$ in G. In addition let $n'_d$ denote the number of vertices that have out-degree $d$ in 
G'. Observe that $n_0 + n_1 + n_2 + n_3 = n$, and let $m = 1.5n = 3n_0 + 2n_1 + n_2$ be the number 
of edges of G. Together this yields 

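\[
\begin{aligned}
\ell(T) &\geq 6n'_0 + 4n'_1 + 3n'_3 + \lfloor n_2/2 \rfloor = 6n_0 + 4n_1 + 3n_2 + 3n_3 + \lfloor n_2/2 \rfloor \\
&= \lfloor 3n + 3n_0 + n_1 + 0.5n_2 \rfloor = \lfloor 3n + 1.5n_0 + 0.5m \rfloor \geq \lfloor 3.75n + 1.5x \rfloor .
\end{aligned}
\]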

For the last step we used $m = 1.5n$, that every vertex $u \in I$ has out-degree 0 in G, and that $|I| \geq x$. 
This concludes the proof of the lemma. 
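
The equalities in the leaf count above are pure arithmetic in $n_0, \ldots, n_3$ and can be sanity-checked numerically. The sketch below is an illustration only, using exact rational arithmetic; the function name `leaf_count_chain` is an assumption and not part of the paper, and it verifies the algebra, not the graph-theoretic claims behind it.

```python
from fractions import Fraction
from math import floor

def leaf_count_chain(n0, n1, n2, n3):
    """Evaluate the three equal expressions in the leaf-count derivation."""
    n = n0 + n1 + n2 + n3
    m = 3 * n0 + 2 * n1 + n2  # number of edges, as written in the text
    first = 6 * n0 + 4 * n1 + 3 * n2 + 3 * n3 + n2 // 2
    second = floor(3 * n + 3 * n0 + n1 + Fraction(n2, 2))
    third = floor(3 * n + Fraction(3, 2) * n0 + Fraction(1, 2) * m)
    return first, second, third
```

For instance, with $n_0 = n_1 = n_2 = n_3 = 2$ we get $n = 8$ and $m = 12 = 1.5n$, and all three expressions evaluate to 33, which equals $\lfloor 3.75n + 1.5x \rfloor$ for $x = n_0 = 2$.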
6 Conclusion of the Proof 



Theorem 5 Cubic MaxLeaf is APX-hard. 

Proof: We show that for every $\varepsilon > 0$, a $(1 - \varepsilon)$-approximation algorithm for cubic MaxLeaf 
yields a $(1 - 141\varepsilon)$-approximation algorithm for Cubic MIS. Let G be a Cubic MIS instance 
on $n$ vertices, which has a maximum independent set of size $x$. Observe that since G is cubic, 
$x \geq n/4$. From G, we construct a Weighted MaxLeaf instance Q as shown in Section 3. Q has 
a tree with at least $\lfloor 3.75n + 1.5x \rfloor$ weighted leaves (Lemma 3), and it can be checked that it 
has $y = 4.5n$ vertices of degree 2. Let $r = 3.75n + 1.5x - \lfloor 3.75n + 1.5x \rfloor$. Note that since $n$ 
is even, the rounded value is half-integral so $r \leq 0.5$. From Q, we construct a Cubic MaxLeaf 
instance H by replacing degree 2 vertices as shown in Section 3. Then H has a tree with at 
least $3.75n + 1.5x - r + 3y = 3.75n + 1.5x - r + 13.5n$ leaves (Lemma 1). 
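
The quantities $r$, $y$ and the leaf bound for H defined in this paragraph can be computed exactly for given $n$ and $x$. The following is a small sketch under the assumption that only these formulas are of interest (no graph is constructed); the function name `reduction_quantities` is hypothetical.

```python
from fractions import Fraction
from math import floor

def reduction_quantities(n, x):
    """Return (r, y, leaf bound for H) for a cubic graph on n vertices
    whose maximum independent set has size x."""
    if n % 2:
        raise ValueError("a cubic graph has an even number of vertices")
    target = Fraction(15, 4) * n + Fraction(3, 2) * x  # 3.75n + 1.5x
    r = target - floor(target)                         # rounding loss: 0 or 1/2
    y = Fraction(9, 2) * n                             # degree-2 vertices of Q
    leaves_in_h = target - r + 3 * y                   # 3.75n + 1.5x - r + 13.5n
    return r, y, leaves_in_h
```

For $K_4$ (so $n = 4$, $x = 1$) this gives $r = 1/2$, $y = 18$ and a bound of 70 leaves. Because $n$ is even, `target` is always a multiple of $1/2$, which is why `r` can only be 0 or $1/2$.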

Now suppose we have a $(1 - \varepsilon)$-approximation algorithm for cubic MaxLeaf. In H, this 
algorithm will find a tree T with at least $(1 - \varepsilon)(3.75n + 1.5x - r + 13.5n)$ leaves. By Lemma 1 
again, this yields a tree T' of Q with at least $(1 - \varepsilon)(3.75n + 1.5x - r + 13.5n) - 13.5n$ weighted 
leaves. So, using $x \geq n/4$, we obtain: 

\[
\begin{aligned}
\ell(T') &\geq 3.75n + 1.5x - r - \varepsilon(3.75n + 1.5x - r + 13.5n) \\
&= 3.75n + 1.5x - r - \varepsilon(17.25n + 1.5x - r) \\
&\geq 3.75n + 1.5x - r - \varepsilon(69x + 1.5x) = 3.75n + 1.5x - r - \gamma x,
\end{aligned}
\]

where $\gamma = 70.5\varepsilon$. Now we consider two cases: 

If $\gamma x < 0.5$, then $\ell(T') > 3.75n + 1.5x - 0.5 - 0.5 = 3.75n + 1.5(x - \frac{2}{3})$. (Here we used 
$r \leq 0.5$.) By Lemma 3, we can construct an independent set I for G with $|I| > x - \frac{2}{3} - \frac{1}{3}$ 
(note that the inequality is again strict). Since $x$ is an integer, $|I| \geq x$. Hence in this case we find 
an optimal independent set. 

On the other hand, if $\gamma x \geq 0.5$, then also $\gamma x \geq r$, so $\ell(T') \geq 3.75n + 1.5x - \gamma x - \gamma x = 
3.75n + 1.5(x - \frac{4}{3}\gamma x)$. So by Lemma 3 again, we find I with $|I| \geq x - \frac{4}{3}\gamma x - \frac{1}{3} \geq x - 2\gamma x$. In 
this case we have a $(1 - 2\gamma) = (1 - 141\varepsilon)$-approximation. Since Cubic MIS is APX-hard [1], 
the APX-hardness of Cubic MaxLeaf follows. □ 

We remark that this reduction is an L-reduction as introduced in [21]. Similarly, using 
the fact that cubic graphs on $n$ vertices have a spanning tree with at least $n/4 + 2$ leaves [15], 
we find that a $(1 + \varepsilon)$-approximation algorithm for MinCDS yields a $(1 - 3\varepsilon)$-approximation 
algorithm for Cubic MaxLeaf on the same graph, so: 

Corollary 6 Cubic MinCDS is APX-hard. 

Proof: We consider the trivial reduction from cubic MaxLeaf. Let G be a cubic graph on $n$ 
vertices for which we wish to find a spanning tree with maximum number of leaves. Let $\ell$ be 
the maximum number of leaves possible for G. Since G is cubic, $\ell \geq n/4 + 2$ [15]. 

G then has a connected dominating set of size at most $n - \ell$. A $(1 + \varepsilon)$-approximation 
algorithm for MinCDS returns a solution S with 

\[
|S| \leq (1 + \varepsilon)(n - \ell) = n - \ell - \varepsilon\ell + \varepsilon n < n - \ell - \varepsilon\ell + 4\varepsilon\ell = n - \ell + 3\varepsilon\ell.
\]

So S can be used to find in polynomial time a spanning tree with at least $\ell - 3\varepsilon\ell$ leaves, which 
together yields a $(1 - 3\varepsilon)$-approximation algorithm for cubic MaxLeaf. The APX-hardness of 
Cubic MinCDS follows. □ 
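
The proof of Corollary 6 uses that a connected dominating set S can be turned into a spanning tree in which every vertex outside S is a leaf, giving at least $n - |S|$ leaves. One way to do this is sketched below, assuming the graph is given as a dictionary of neighbor sets; the function name `tree_from_cds` and the input format are illustrative choices, not taken from the paper.

```python
from collections import deque

def tree_from_cds(adj, cds):
    """Build a spanning tree in which every vertex outside cds is a leaf.

    adj: dict mapping each vertex to the set of its neighbors.
    cds: non-empty connected dominating set of the graph.
    """
    cds = set(cds)
    root = next(iter(cds))
    tree, seen, queue = [], {root}, deque([root])
    # A BFS restricted to cds yields a spanning tree of the induced subgraph.
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in cds and v not in seen:
                seen.add(v)
                tree.append((u, v))
                queue.append(v)
    if seen != cds:
        raise ValueError("cds does not induce a connected subgraph")
    # Attach every remaining vertex to one neighbor inside cds, as a leaf.
    for v in adj:
        if v not in cds:
            u = next((w for w in adj[v] if w in cds), None)
            if u is None:
                raise ValueError("cds is not dominating")
            tree.append((u, v))
    return tree
```

Both phases run in time linear in the size of the graph, consistent with the polynomial-time claim in the proof.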